Tidycensus



Selecting Dataset

The Census Bureau has released APIs only for the following datasets and time frames, grouped here by the function that retrieves them:

get_acs()

  • American Community Survey 1-Year Data (2011-2017)
  • American Community Survey 3-Year Data (2012-2013)
  • American Community Survey 5-Year Data (2009-2017)

get_decennial()

  • Decennial US Census (1990, 2000, 2010)
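The coverage listed above can be encoded as a small lookup so that an unsupported dataset/year pair is caught before any request is made. This is a minimal Python sketch for illustration only; it does not call tidycensus. The names `AVAILABLE_YEARS` and `is_available` and the dataset keys are assumptions introduced here, and the year ranges are copied from the list above, so they may be out of date.

```python
# Year coverage copied from the list above; newer releases may extend it.
# The dictionary keys are labels chosen for this sketch.
AVAILABLE_YEARS = {
    "acs1": set(range(2011, 2018)),   # ACS 1-Year, 2011-2017
    "acs3": {2012, 2013},             # ACS 3-Year, 2012-2013
    "acs5": set(range(2009, 2018)),   # ACS 5-Year, 2009-2017
    "decennial": {1990, 2000, 2010},  # Decennial US Census
}


def is_available(dataset: str, year: int) -> bool:
    """Return True if the dataset/year pair appears in the list above."""
    return year in AVAILABLE_YEARS.get(dataset, set())


print(is_available("acs5", 2009))  # True
print(is_available("acs1", 2009))  # False
```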

Selecting Geographic Scale

Geography Definition
us ACS only. United States
region ACS only. Census regions are groupings of states and the District of Columbia that subdivide the United States for the presentation of census data. The Census Bureau defines four census regions and identifies each one with a single-digit census code: Northeast (1), Midwest (2), South (3), and West (4). Puerto Rico and the Island Areas are not part of any census region.
division ACS only. Census divisions are groupings of states and the District of Columbia that subdivide the census regions for the presentation of census data. The Census Bureau defines nine census divisions and identifies each one with a single-digit census code: New England (1), Middle Atlantic (2), East North Central (3), West North Central (4), South Atlantic (5), East South Central (6), West South Central (7), Mountain (8), and Pacific (9). Puerto Rico and the Island Areas are not part of any census division.
state States and equivalent entities are the primary governmental divisions of the United States. In addition to the 50 states, the Census Bureau treats the District of Columbia, Puerto Rico, American Samoa, the Commonwealth of the Northern Mariana Islands, Guam, and the U.S. Virgin Islands as the statistical equivalents of states for the purpose of data presentation.
county Counties and their equivalents are the primary division of states providing complete coverage of a state. In most states, counties are the primary legal division. Louisiana has parishes; Alaska has boroughs, city and boroughs, municipalities, and census subareas; Maryland, Missouri, Nevada, and Virginia have independent cities in addition to counties; Puerto Rico has municipios; American Samoa has districts and islands; the Commonwealth of the Northern Mariana Islands has municipalities; the U.S. Virgin Islands have islands; and the District of Columbia and Guam are also county equivalents.
county subdivision County subdivisions are the primary divisions of counties and statistical equivalent entities. They are either legal entities (minor civil divisions) or statistical entities (census county divisions, census subareas, and unorganized territories). The two main types of county subdivisions are minor civil divisions (MCD) and census county divisions (CCD). MCDs are legal entities that provide governmental and/or administrative services, and are most commonly known as towns and townships. CCDs are statistical entities established cooperatively by the Census Bureau and state and local officials. Census subareas occur only in Alaska and unorganized territories (UT) are defined by the Census Bureau for areas where portions of counties, or equivalent entities, are not already included in an MCD.
tract (must include state) Census tracts are small, relatively permanent statistical subdivisions of a county or county equivalent and generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. The Census Bureau created census tracts to provide a stable set of boundaries for statistical comparison from census to census. Census tracts occasionally split due to population growth or merge when there is substantial population decline. Local governments have the opportunity to review census tract boundaries prior to each decennial census through the Participant Statistical Areas Program. If a local participant is not available, the Census Bureau will update the boundaries. In general, census tracts can merge or split based on population changes but cannot be completely redrawn, so that data remain comparable over time.
block group (must include state) Block groups are statistical divisions of census tracts and generally contain between 600 and 3,000 people. Local participants delineate most block groups prior to each decennial census through the Participant Statistical Areas Program. If a local participant is not available, the Census Bureau will define them. Since block groups consist of a cluster of census blocks, they control block numbering. At the time of block delineation, all of the blocks that fall within a block group will start with the same number. For example, census blocks 2001, 2002, 2003, … 2999 in census tract 1310.02 belong to block group 2.
block (must include state and county) Decennial only. Census Blocks are the smallest geographic areas that the Census Bureau uses to tabulate decennial data. Blocks are statistical areas bounded by visible features, such as streets, roads, streams, and railroad tracks, and by nonvisible boundaries, such as selected property lines and city, township, school district, and county limits. Generally, census blocks are small in area; for example, a block in a city bounded on all sides by streets. Census blocks cover the entire territory of the United States, Puerto Rico, and the Island Areas. Census blocks nest within all other tabulated census geographic entities and are the basis for all tabulated data. Blocks are defined once a decade and data are available only from the decennial census 100% data (age, sex, race, Hispanic/Latino origin, relationship to householder, and own/rent house). In between decennial censuses, blocks may split due to boundary changes, but no statistical data are available for the resulting blocks.
place The Census Bureau identifies two types of places: incorporated places and census designated places (CDP). Incorporated places are legal areas and provide governmental functions for a concentration of people. They are usually cities, towns, villages, or boroughs. CDPs are statistical areas delineated to provide data for a settled community, or concentration of population, identifiable by name but not legally incorporated. There is no minimum population threshold for a CDP and they are delineated with the help of local governments, through the Participant Statistical Areas Program, usually once a decade.
metropolitan statistical area/micropolitan statistical area The Office of Management and Budget (OMB) defines metropolitan and micropolitan statistical areas, collectively known as core based statistical areas (CBSA), based on data from the decennial census and commuting data from the American Community Survey. Each metropolitan or micropolitan area consists of one or more whole counties and includes the counties containing the core urban area, as well as any adjacent counties that have a high degree of social and economic integration (as measured by commuting to work) with the urban core. Metropolitan statistical areas are based on urbanized areas of 50,000 or more people and micropolitan statistical areas are based on urban clusters of at least 10,000 but less than 50,000 people.
combined statistical area ACS only. Combined statistical areas (CSA) consist of two or more adjacent metropolitan and micropolitan statistical areas that have substantial employment interchange. The metropolitan and micropolitan statistical areas that combine to create a CSA retain separate identities within the larger CSA.
urban area ACS only. Urban area is the collective term for urbanized areas (UAs) and urban clusters (UCs). A UA is a densely developed area that contains 50,000 or more people. A UC is a densely developed area that has at least 2,500 people but fewer than 50,000 people. The Census Bureau defines urban areas once a decade after the population totals for the decennial census are available, and classifies all territory, population, and housing units located within a UA or UC as urban and all area outside of a UA or UC as rural. Urban areas are used as the cores on which core based statistical areas are defined.
congressional district Congressional districts are electoral districts that elect a single member of congress to the House of Representatives. There are 435 congressional districts in the U.S. and the Census Bureau’s decennial census counts determine the number of congressional districts given to each state. The District of Columbia, Puerto Rico, and each Island Area (American Samoa, Guam, Northern Mariana Islands, and U.S. Virgin Islands) are assigned one nonvoting delegate each.
public use microdata area ACS only. Public use microdata areas (PUMAs) are geographic areas defined to be used with public use microdata sample (PUMS) files. PUMAs are a collection of counties or tracts within counties with more than 100,000 people, based on the decennial census population counts. State partners define PUMAs once a decade after the decennial census. Data for PUMAs are available from the American Community Survey (ACS). PUMS files are available from the ACS and decennial census.
zip code tabulation area ZIP Code® tabulation areas (ZCTAs) are generalized areal representations of U.S. Postal Service (USPS) ZIP Code service areas. The Census Bureau collects ZIP Code data for housing units and many non-residential addresses from the USPS and from various field operations. Based on this ZIP Code data, the Census Bureau aggregates census blocks with the same ZIP Code to form ZCTAs. The Census Bureau then labels each ZCTA with the five-digit ZIP Code number used by the USPS.
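Two rules from the table above lend themselves to a quick check: some geographies must be requested together with their parent geographies, and a block's first digit names its block group. The Python sketch below is illustrative only and is not tidycensus code; `REQUIRED_PARENTS`, `missing_parents`, and `block_group_of` are hypothetical names introduced here.

```python
# Parent geographies each level requires, per the notes in the table above.
REQUIRED_PARENTS = {
    "tract": ["state"],
    "block group": ["state"],
    "block": ["state", "county"],
}


def missing_parents(geography: str, **supplied) -> list:
    """List required parent geographies that were not supplied."""
    required = REQUIRED_PARENTS.get(geography, [])
    return [parent for parent in required if not supplied.get(parent)]


def block_group_of(block_number: str) -> str:
    """A block's first digit is the number of the block group containing it."""
    return block_number[0]


print(missing_parents("block", state="36"))  # ['county']
print(block_group_of("2001"))                # 2
```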

- Codes

State FIPS

State Code State Code State Code State Code
Alabama 01 Alaska 02 Arizona 04 Arkansas 05
California 06 Colorado 08 Connecticut 09 Delaware 10
District of Columbia 11 Florida 12 Georgia 13 Hawaii 15
Idaho 16 Illinois 17 Indiana 18 Iowa 19
Kansas 20 Kentucky 21 Louisiana 22 Maine 23
Maryland 24 Massachusetts 25 Michigan 26 Minnesota 27
Mississippi 28 Missouri 29 Montana 30 Nebraska 31
Nevada 32 New Hampshire 33 New Jersey 34 New Mexico 35
New York 36 North Carolina 37 North Dakota 38 Ohio 39
Oklahoma 40 Oregon 41 Pennsylvania 42 Rhode Island 44
South Carolina 45 South Dakota 46 Tennessee 47 Texas 48
Utah 49 Vermont 50 Virginia 51 Washington 53
West Virginia 54 Wisconsin 55 Wyoming 56 American Samoa 60
Guam 66 Northern Mariana Islands 69 Puerto Rico 72 U.S. Minor Outlying Islands 74
U.S. Virgin Islands 78 All States + DC 1:56 All States - DC 1:10, 12:56
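The ranges in the last row (`1:56` and `1:10, 12:56`) are shorthand for "every code from 1 to 56". Comparing that range against the table shows that a handful of numbers inside it are not assigned to any state. The Python sketch below only works through the arithmetic; the variable names are introduced here for illustration, and how any particular function treats the unassigned numbers is not something the table states.

```python
# FIPS codes for the 50 states and DC, as listed in the table above.
STATE_AND_DC = [
    1, 2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
    37, 38, 39, 40, 41, 42, 44, 45, 46, 47, 48, 49, 50, 51, 53, 54,
    55, 56,
]

# Numbers inside 1..56 that the table does not assign to any state.
unassigned = sorted(set(range(1, 57)) - set(STATE_AND_DC))

# Zero-padded two-character codes, excluding DC (code 11).
states_only = [f"{code:02d}" for code in STATE_AND_DC if code != 11]

print(unassigned)        # [3, 7, 14, 43, 52]
print(len(states_only))  # 50
```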

Tract Numbers


Selecting Variables

- ACS tables

Element 1: Type of Table

Code Title Definition
B Detailed Base Table Most detailed estimates on all topics for all geographies
C Detailed Collapsed Table Similar information from its corresponding Base Table (B) but at a lower level of detail because one or more lines in the Base Table have been grouped together
K20 Supplemental Table The only source of 1-year statistics for selected geographies with populations of 20,000-64,999
S Subject Table A span of information on a particular ACS subject, such as veterans, presented in the format of both estimates and percentages
R Ranking Table State rankings across approximately 90 key variables
GCT Geographic Comparison Table Comparisons across approximately 95 key variables for geographies other than states such as counties or congressional districts
DP Data Profile Broad social, economic, housing, and demographic information in a total of four profiles
NP Narrative Profile Summaries of information in the Data Profiles using nontechnical text
CP Comparison Profile Comparisons of ACS estimates over time in the same layout as the Data Profiles
S0201 Selected Population Profile Broad ACS statistics for population subgroups by race, ethnicity, ancestry, tribal affiliation, and place of birth

Element 2: Subject (B, C, K20, S, R, and GCT Tables Only)

Code Subject
01 Age; Sex
02 Race
03 Hispanic or Latino Origin
04 Ancestry
05 Citizenship Status; Year of Entry; Foreign Born Place of Birth
06 Place of Birth
07 Migration/Residence 1 Year Ago
08 Commuting (Journey to Work); Place of Work
09 Relationship to Householder
10 Grandparents and Grandchildren Characteristics
11 Household Type; Family Type; Subfamilies
12 Marital Status; Marital History
13 Fertility
14 School Enrollment
15 Educational Attainment; Undergraduate Field of Degree
16 Language Spoken at Home
17 Poverty Status
18 Disability Status
19 Income
20 Earnings
21 Veteran Status; Period of Military Service
22 Food Stamps/Supplemental Nutrition Assistance Program (SNAP)
23 Employment Status; Work Status Last Year
24 Industry, Occupation, and Class of Worker
25 Housing Characteristics
26 Group Quarters
27 Health Insurance Coverage
28 Computer and Internet Use
29 Citizen Voting-Age Population
98 Quality Measures
99 Allocation Table for Any Subject

Element 3: Table Number within a Subject

A two- or three-digit number that uniquely identifies the table within a given subject.

Element 4: Race Iteration (Selected Tables Only)

Code Population
A White Alone
B Black or African American Alone
C American Indian and Alaska Native Alone
D Asian Alone
E Native Hawaiian and Other Pacific Islander Alone
F Some Other Race Alone
G Two or More Races
H White Alone, Not Hispanic or Latino
I Hispanic or Latino

Element 5: Identification for Puerto Rico Geographies (Selected Tables Only)

For selected tables, a final alphabetic suffix “PR” follows to indicate a table is available for Puerto Rico geographies only.
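The five elements above can be read off a table ID mechanically. The following Python sketch uses a regular expression that covers only the table types that carry a subject code (B, C, K20, S, R, GCT); the pattern and the name `parse_table_id` are assumptions made for illustration. Note that the profile code S0201 also fits this pattern, so a real parser would need to special-case it.

```python
import re

# Pattern for table IDs that carry a subject code (Element 2).
TABLE_ID = re.compile(
    r"^(?P<type>B|C|K20|S|R|GCT)"   # Element 1: type of table
    r"(?P<subject>\d{2})"           # Element 2: subject
    r"(?P<number>\d{2,3})"          # Element 3: table number within subject
    r"(?P<race>[A-I])?"             # Element 4: optional race iteration
    r"(?P<pr>PR)?$"                 # Element 5: optional Puerto Rico suffix
)


def parse_table_id(table_id: str) -> dict:
    """Split an ACS table ID into the elements described above."""
    match = TABLE_ID.match(table_id)
    if match is None:
        raise ValueError(f"unrecognized table ID: {table_id}")
    return match.groupdict()


print(parse_table_id("B01001"))
print(parse_table_id("B23002A"))
```

For `B01001` this yields type `B`, subject `01` (Age; Sex), and table number `001`; for `B23002A` it additionally yields race iteration `A` (White Alone).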

Useful Tables

Code Description
B01001 Sex by Age: total population, male population, female population, age counts
B03002 Hispanic or Latino by Race: Non-Hispanic race counts to avoid double counting when Hispanic is a category
B17001 Poverty Status in the Past 12 Months by Sex by Age: poverty count by age, sex, or both
B23025 Employment Status for the Population 16 Years and Over: unemployment rate
B23002_ Sex by Age by Employment Status for the Population 16 Years and Over (_ race only): disaggregated unemployment rate, where _ stands for a race iteration code from Element 4
B25002 Occupancy Status: count of vacant, occupied, and total housing units
B25003 Tenure: count of owner-occupied and renter-occupied units
B25064 Median Gross Rent (Dollars): rent estimate
B25071 Median Gross Rent as a Percentage of Household Income in the Past 12 Months: rent burden estimate

- Decennial tables

Code Description
P Population tables
H Housing tables
PCT/HCT Population or housing tables that cover geographies to the census tract level
PCO/HCO Population or housing tables that cover geographies to the county level
PL Tables derived from the Redistricting Data (P.L. 94-171) Summary File (Census 2000 only)

Useful Tables

Code Description
P12 Sex by Age: total population, male population, female population, age counts
P5 Hispanic or Latino by Race: Non-Hispanic race counts to avoid double counting when Hispanic is a category
H3 Occupancy Status: count of vacant, occupied, and total housing units
H4 Tenure: count of owner-occupied and renter-occupied units

- Finding variables

name label concept
B00001_001 Estimate!!Total UNWEIGHTED SAMPLE COUNT OF THE POPULATION
B00002_001 Estimate!!Total UNWEIGHTED SAMPLE HOUSING UNITS
B01001_001 Estimate!!Total SEX BY AGE
B01001_002 Estimate!!Total!!Male SEX BY AGE
B01001_003 Estimate!!Total!!Male!!Under 5 years SEX BY AGE
B01001_004 Estimate!!Total!!Male!!5 to 9 years SEX BY AGE

name label concept
B19001_001 Estimate!!Total HOUSEHOLD INCOME IN THE PAST 12 MONTHS (IN 2016 INFLATION-ADJUSTED DOLLARS)
B19001_002 Estimate!!Total!!Less than $10,000 HOUSEHOLD INCOME IN THE PAST 12 MONTHS (IN 2016 INFLATION-ADJUSTED DOLLARS)
B19001_003 Estimate!!Total!!$10,000 to $14,999 HOUSEHOLD INCOME IN THE PAST 12 MONTHS (IN 2016 INFLATION-ADJUSTED DOLLARS)
B19001_004 Estimate!!Total!!$15,000 to $19,999 HOUSEHOLD INCOME IN THE PAST 12 MONTHS (IN 2016 INFLATION-ADJUSTED DOLLARS)
B19001_005 Estimate!!Total!!$20,000 to $24,999 HOUSEHOLD INCOME IN THE PAST 12 MONTHS (IN 2016 INFLATION-ADJUSTED DOLLARS)
B19001_006 Estimate!!Total!!$25,000 to $29,999 HOUSEHOLD INCOME IN THE PAST 12 MONTHS (IN 2016 INFLATION-ADJUSTED DOLLARS)

name label concept
B02001_007 Estimate!!Total!!Some other race alone RACE
B02001_008 Estimate!!Total!!Two or more races RACE
B02001_009 Estimate!!Total!!Two or more races!!Two races including Some other race RACE
B02001_010 Estimate!!Total!!Two or more races!!Two races excluding Some other race, and three or more races RACE
B03002_008 Estimate!!Total!!Not Hispanic or Latino!!Some other race alone HISPANIC OR LATINO ORIGIN BY RACE
B03002_009 Estimate!!Total!!Not Hispanic or Latino!!Two or more races HISPANIC OR LATINO ORIGIN BY RACE
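In the listings above, each label encodes a hierarchy separated by `!!`, and the concept names the table. A variable can therefore be located by splitting labels into levels or by searching labels and concepts for a keyword. The Python sketch below works on a few rows copied from the listings; `VARIABLES`, `label_levels`, and `find_variables` are hypothetical names used only for illustration.

```python
# A few rows copied from the listings above: (name, label, concept).
VARIABLES = [
    ("B01001_001", "Estimate!!Total", "SEX BY AGE"),
    ("B01001_002", "Estimate!!Total!!Male", "SEX BY AGE"),
    ("B01001_003", "Estimate!!Total!!Male!!Under 5 years", "SEX BY AGE"),
    ("B02001_008", "Estimate!!Total!!Two or more races", "RACE"),
]


def label_levels(label: str) -> list:
    """Split a label on '!!' into its hierarchy levels."""
    return label.split("!!")


def find_variables(keyword: str) -> list:
    """Return names whose label or concept contains the keyword."""
    key = keyword.lower()
    return [name for name, label, concept in VARIABLES
            if key in label.lower() or key in concept.lower()]


print(label_levels("Estimate!!Total!!Male!!Under 5 years"))
print(find_variables("male"))  # ['B01001_002', 'B01001_003']
```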

Getting Data

- Source

GEOID NAME Population
01 Alabama 4779736
02 Alaska 710231
04 Arizona 6392017
05 Arkansas 2915918
06 California 37253956

GEOID NAME Female_Newborn_Estimate Female_Newborn_MOE
36047016700 Census Tract 167, Kings County, New York 349 177
36047016800 Census Tract 168, Kings County, New York 15 15
36047016900 Census Tract 169, Kings County, New York 169 144
36047017000 Census Tract 170, Kings County, New York 97 54
36047017100 Census Tract 171, Kings County, New York 163 58
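The GEOID column packs the geographic hierarchy into one string: for a tract, two digits of state FIPS, three digits of county FIPS, and six digits of tract code. The Python sketch below splits the first tract GEOID from the table above; `split_tract_geoid` is a hypothetical helper written for illustration.

```python
def split_tract_geoid(geoid: str) -> dict:
    """Split an 11-character tract GEOID into state, county, and tract."""
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError("expected an 11-digit tract GEOID")
    tract = geoid[5:]
    # The last two tract digits are a suffix, shown after a decimal
    # point only when it is not "00" (e.g. 131002 -> 1310.02).
    base, suffix = str(int(tract[:4])), tract[4:]
    label = base if suffix == "00" else f"{base}.{suffix}"
    return {"state": geoid[:2], "county": geoid[2:5],
            "tract": tract, "tract_label": label}


print(split_tract_geoid("36047016700"))
```

Here state `36` matches New York in the State FIPS table, and the tract label `167` matches the NAME column.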

- Whole Table

GEOID NAME B19001_001E B19001_001M B19001_002E B19001_002M B19001_003E B19001_003M B19001_004E B19001_004M B19001_005E B19001_005M B19001_006E B19001_006M B19001_007E B19001_007M B19001_008E B19001_008M B19001_009E B19001_009M B19001_010E B19001_010M B19001_011E B19001_011M B19001_012E B19001_012M B19001_013E B19001_013M B19001_014E B19001_014M B19001_015E B19001_015M B19001_016E B19001_016M B19001_017E B19001_017M
36001 Albany County, New York 124108 805 7606 600 6285 524 5578 517 5929 639 5026 419 5819 566 4858 460 5379 444 4663 527 9980 629 12530 723 16017 756 11693 641 7773 467 8093 576 6879 490
36003 Allegany County, New York 18032 295 1282 173 1193 156 1194 157 1114 135 988 120 1122 119 1113 129 1209 152 874 112 1605 155 1916 164 2277 196 1103 114 501 84 320 60 221 48
36005 Bronx County, New York 490740 1515 76719 1775 45398 1629 36856 1357 30086 1030 27848 995 26839 1108 24339 1193 22364 1118 18635 1096 34602 1184 39721 1432 44111 1429 26552 1007 14395 880 12865 638 9410 591
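In this wide layout every variable occupies two columns: the suffix `E` marks the estimate and `M` the margin of error. Reshaping to one record per variable is often easier to work with. The Python sketch below does this for the first three variables of the Albany County row; `to_long` is a hypothetical helper, not a tidycensus function.

```python
# Albany County values for B19001_001 to B19001_003, copied from the row above.
wide_row = {
    "GEOID": "36001",
    "B19001_001E": 124108, "B19001_001M": 805,
    "B19001_002E": 7606, "B19001_002M": 600,
    "B19001_003E": 6285, "B19001_003M": 524,
}


def to_long(row: dict) -> list:
    """Turn E/M column pairs into one record per variable."""
    records = {}
    for column, value in row.items():
        if column == "GEOID":
            continue
        variable, kind = column[:-1], column[-1]  # e.g. "B19001_001", "E"
        field = "estimate" if kind == "E" else "moe"
        records.setdefault(variable, {"GEOID": row["GEOID"],
                                      "variable": variable})[field] = value
    return list(records.values())


for record in to_long(wide_row):
    print(record)
```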

- Iterating

GEOID NAME B01001_003E B01001_003M B01001_004E B01001_004M B01001_005E B01001_005M B01001_006E B01001_006M B01001_027E B01001_027M B01001_028E B01001_028M B01001_029E B01001_029M B01001_030E B01001_030M
36047016700 Census Tract 167, Kings County, New York 288 127 224 104 139 92 94 65 349 177 240 151 85 65 26 32
36047016800 Census Tract 168, Kings County, New York 89 53 31 24 71 42 25 22 15 15 46 29 60 42 82 47
36047016900 Census Tract 169, Kings County, New York 227 102 58 55 114 68 24 29 169 144 227 117 80 51 9 23

GEOID NAME variable estimate moe year
36 New York B03002_001 19398125 NA 2012
36 New York B03002_001 19487053 NA 2013
36 New York B03002_001 19594330 NA 2014
36 New York B03002_001 19673174 NA 2015
36 New York B03002_001 19697457 NA 2016
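The year-by-year table above is the kind of result produced by repeating one request per year and stacking the rows. The Python sketch below imitates that pattern without touching the API: `fetch_total_population` is a hypothetical stand-in that returns the values printed above, and the final lines compute the change between the first and last year.

```python
# Stand-in for one API request per year; values copied from the table above.
TOTALS_BY_YEAR = {
    2012: 19398125, 2013: 19487053, 2014: 19594330,
    2015: 19673174, 2016: 19697457,
}


def fetch_total_population(year: int) -> dict:
    """Hypothetical helper that returns one row for the requested year."""
    return {"GEOID": "36", "NAME": "New York", "variable": "B03002_001",
            "estimate": TOTALS_BY_YEAR[year], "year": year}


# Request each year in turn and stack the rows into one table.
rows = [fetch_total_population(year) for year in range(2012, 2017)]

# Change between the first and last year.
change = rows[-1]["estimate"] - rows[0]["estimate"]
pct_change = 100 * change / rows[0]["estimate"]
print(change)                # 299332
print(round(pct_change, 2))  # 1.54
```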

Wrangling Data

- Normalizing

GEOID NAME variable estimate moe summary_est summary_moe pct
36001 Albany County, New York White 226677 292 307891 NA 73.622483
36001 Albany County, New York Black 36350 726 307891 NA 11.806126
36001 Albany County, New York Native 329 94 307891 NA 0.106856
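The `pct` column above is each estimate divided by the summary estimate, times 100. A minimal Python sketch of that calculation, using the three Albany County rows; `percent_of` is a hypothetical helper name.

```python
def percent_of(estimate: float, summary_est: float) -> float:
    """Express an estimate as a percentage of the summary estimate."""
    return 100 * estimate / summary_est


# Rows copied from the table above: (variable, estimate, summary_est).
rows = [
    ("White", 226677, 307891),
    ("Black", 36350, 307891),
    ("Native", 329, 307891),
]

for variable, estimate, summary_est in rows:
    print(variable, round(percent_of(estimate, summary_est), 6))
```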

- Largest variable

GEOID NAME variable estimate moe summary_est summary_moe
36001 Albany County, New York White 226677 292 307891 NA
36003 Allegany County, New York White 45116 40 47700 NA
36005 Bronx County, New York Hispanic 796193 NA 1436785 NA
36007 Broome County, New York White 166815 90 197381 NA
36009 Cattaraugus County, New York White 71489 41 78506 NA
36011 Cayuga County, New York White 71408 85 78783 NA

variable n
Hispanic 2
White 60
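Finding the largest group per county amounts to grouping rows by GEOID, keeping the row with the highest estimate, and then tallying the winners. The Python sketch below uses only rows printed in this section, so just Albany County has more than one candidate; `largest_by_geoid` is a hypothetical helper written for illustration.

```python
from collections import Counter

# Rows copied from the tables in this section: (GEOID, variable, estimate).
rows = [
    ("36001", "White", 226677),
    ("36001", "Black", 36350),
    ("36001", "Native", 329),
    ("36003", "White", 45116),
    ("36005", "Hispanic", 796193),
]


def largest_by_geoid(records: list) -> dict:
    """Keep, for each GEOID, the row with the highest estimate."""
    best = {}
    for geoid, variable, estimate in records:
        if geoid not in best or estimate > best[geoid][1]:
            best[geoid] = (variable, estimate)
    return best


largest = largest_by_geoid(rows)
tally = Counter(variable for variable, _ in largest.values())
print(largest["36001"])  # ('White', 226677)
print(dict(tally))       # {'White': 2, 'Hispanic': 1}
```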

Comparing Years

The Census Bureau divides the country into tracts of around 4,000 people that are “designed to be relatively homogeneous units with respect to population characteristics, economic status, and living conditions”. Tracts were introduced in 1910, but at first only a few cities (New York among them) were tracted; the whole country was not tracted until 1990. Because tracts are drawn around population size, their boundaries can change from one census to the next. This makes comparing indicators over time difficult, since the same tract number may refer to a different area in different datasets. Adjusting the data to a common set of tract boundaries enables comparisons of the same area over time.

The most recent change to Census tract boundaries followed the 2010 Census. Thus, Census data (or other data reported at the Census tract level) from 1970-2000 needs to be translated into 2010 Census tracts (or 2010 and later data can be translated back to a previous year’s geography) before a comparison can be made.

The Longitudinal Tract Database (LTDB) is an open-source crosswalk that helps bridge data for Census tracts across time. Each row in the crosswalk contains a tract number from a previous year, a tract number of the current year that overlaps with the previous boundary, and the proportion of the previous tract that is within the current tract to serve as a weight. Old tracts that were split into multiple new tracts have a row for each new tract, so the column containing the old tract numbers does not contain unique values. Similarly, new tracts that are made up of several old tracts have a row for each of the old tracts that were merged, so the column containing new tract numbers does not contain unique values. Only the combination of old and new tract number columns is unique.
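The weighting described above can be sketched in a few lines: each old-tract count is multiplied by the crosswalk weight and the products are summed within each new tract. The Python example below uses entirely hypothetical tract identifiers, weights, and counts, and the tuple layout is an assumption rather than the actual LTDB file format. It applies to counts; medians and rates would need a different treatment.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (old_tract, new_tract, weight), where weight
# is the share of the old tract that falls inside the new tract.
crosswalk = [
    ("old_A", "new_1", 0.6),  # old_A was split into new_1 and new_2
    ("old_A", "new_2", 0.4),
    ("old_B", "new_2", 1.0),  # old_B was merged into new_2
]

# Hypothetical counts reported on the old tract boundaries.
old_counts = {"old_A": 1000, "old_B": 500}


def to_new_tracts(xwalk: list, counts: dict) -> dict:
    """Reallocate old-tract counts to new tracts using the weights."""
    result = defaultdict(float)
    for old, new, weight in xwalk:
        result[new] += weight * counts[old]
    return dict(result)


print(to_new_tracts(crosswalk, old_counts))
```

With these made-up inputs, `new_1` receives 60% of `old_A`, while `new_2` receives the remaining 40% plus all of `old_B`, so the total across new tracts still equals the total across old tracts.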

Tigris



Visualization